Using CohortMethod to perform comparative cohort studies
LSPS through FeatureExtraction and Cyclops
Evaluating objective diagnostics
Motivating study
What is the relative risk of gastrointestinal (GI) bleeding-related hospitalization within 30 days of celecoxib vs diclofenac treatment in patients with osteoarthritis of the knee?
Indication (I): osteoarthritis of the knee
Target (T): celecoxib first-exposure
Comparator (C): diclofenac first-exposure
Outcome (O): GI-bleed hospitalization
Time-at-risk (TAR): all time after exposure initiation
Model specification: LSPS-matched Cox proportional hazards regression
What happened to the exposure cohorts? (missing indication)
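For reference, the Cox proportional hazards model named in the specification above can be written in its simplest form, with a single treatment indicator. The symbols \(Z\), \(\beta\) and \(\lambda_0\) are notation introduced here for illustration, not taken from the slides:

\[
\lambda(t \mid Z) = \lambda_0(t)\,\exp(\beta Z), \qquad
Z = \begin{cases} 1 & \text{celecoxib (T)} \\ 0 & \text{diclofenac (C)} \end{cases}
\]

Under this model the hazard ratio of T versus C is \(\exp(\beta)\), estimated on the LSPS-matched population.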
Data pull
# Define which types of covariates must be constructed:
covSettings <- createDefaultCovariateSettings(
  excludedCovariateConceptIds = c(diclofenacConceptId, celecoxibConceptId),
  addDescendantsToExclude = TRUE
)

# Pull data (no need to run)
cohortMethodData <- getDbCohortMethodData(
  connectionDetails = connectionDetails,
  cdmDatabaseSchema = cdmDatabaseSchema,
  targetId = 1,
  comparatorId = 2,
  outcomeIds = 77,
  firstExposureOnly = FALSE,
  removeDuplicateSubjects = "keep all",
  restrictToCommonPeriod = FALSE,
  washoutPeriod = 0,
  exposureDatabaseSchema = cohortDatabaseSchema,
  exposureTable = cohortTableNames$cohortTable,
  outcomeDatabaseSchema = cohortDatabaseSchema,
  outcomeTable = cohortTableNames$cohortTable,
  covariateSettings = covSettings
)
Simulating patient-level data from a shareable profile
J&J has kindly shared a profile of these cohorts from the Optum EHR data source (contains no patient-level information) for synthetic tutorial purposes
Copy into ucla-biostat-218/data for uniform access in class
To load the profile and simulate a cohortMethodData object:
library(CohortMethod)

simulationProfile <- readRDS(
  file.path(getwd(), "data", "cohortMethodDataSimulationProfile.rds")
)

# Population sizes used to create profile
simulationProfile$metaData$attrition %>%
  dplyr::select(description, targetPersons, comparatorPersons)
# A tibble: 1 × 3
  description      targetPersons comparatorPersons
  <chr>                    <int>             <int>
1 Original cohorts        116987            185248
simulationProfile$metaData$populationSize
[1] 302235
cohortMethodData <- simulateCohortMethodData(
  profile = simulationProfile,
  n = 10000 # for demonstration purposes
)
summary(cohortMethodData)
CohortMethodData object summary
Target cohort ID: 1
Comparator cohort ID: 2
Outcome cohort ID(s): 77
Target persons: 4123
Comparator persons: 5877
Outcome counts:
   Event count Person count
77         512          492
Covariates:
Number of covariates: 90758
Number of non-zero covariate values: 5013205
Important
Creating a cohortMethodData object from a remote DB takes considerable time; remember to save the object locally.
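A minimal sketch of saving and reloading the object with CohortMethod's `saveCohortMethodData()` and `loadCohortMethodData()`. The file name `cohortMethodData.zip` is a hypothetical choice, and the guard only makes the snippet safe to run in a session where the object or package is absent:

```r
# Hypothetical file location; adjust to your own project layout
cmDataFile <- file.path(getwd(), "data", "cohortMethodData.zip")

if (exists("cohortMethodData") && requireNamespace("CohortMethod", quietly = TRUE)) {
  # Save once after the slow extraction ...
  CohortMethod::saveCohortMethodData(cohortMethodData, cmDataFile)
  # ... then reload, here or in a later session, to get a usable object back
  cohortMethodData <- CohortMethod::loadCohortMethodData(cmDataFile)
}
```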
Ellipsoid (most covariates are independent \(\ldots\) unlike reality)
From the original patient cohorts
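The slides that follow refer to `studyPop`, `matchedPop` and `balance`. A minimal sketch of how such objects are typically built with the CohortMethod 5.x functions `createStudyPopulation()`, `createPs()`, `matchOnPs()` and `computeCovariateBalance()` is given below. The wrapper name `buildMatchedBalance` is hypothetical, and every argument value (risk window, 1:1 matching) is an assumption for illustration, not necessarily the setting used in this lecture:

```r
# Sketch only: argument values are illustrative assumptions
buildMatchedBalance <- function(cohortMethodData, outcomeId = 77) {
  # Study population with an assumed "all time after initiation" risk window
  studyPop <- CohortMethod::createStudyPopulation(
    cohortMethodData = cohortMethodData,
    outcomeId = outcomeId,
    removeSubjectsWithPriorOutcome = TRUE,
    minDaysAtRisk = 1,
    riskWindowStart = 0,
    startAnchor = "cohort start",
    riskWindowEnd = 99999, # assumed stand-in for "all time"
    endAnchor = "cohort start"
  )
  # Large-scale propensity score model (regularized regression via Cyclops)
  ps <- CohortMethod::createPs(
    cohortMethodData = cohortMethodData,
    population = studyPop
  )
  # 1:1 matching on the propensity score (assumed ratio)
  matchedPop <- CohortMethod::matchOnPs(ps, maxRatio = 1)
  # Covariate balance before and after matching
  balance <- CohortMethod::computeCovariateBalance(matchedPop, cohortMethodData)
  list(studyPop = studyPop, ps = ps, matchedPop = matchedPop, balance = balance)
}
```

Calling `res <- buildMatchedBalance(cohortMethodData)` would then give `res$studyPop`, `res$matchedPop` and `res$balance`.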
Evaluating covariate balance
plotCovariateBalanceOfTopVariables(balance)
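Balance is usually summarized per covariate as a standardized difference of means (SDM), with |SDM| < 0.1 a common rule of thumb for acceptable balance. The base-R sketch below shows one standard way to compute it. The helper `stdDiff` and the simulated covariate are hypothetical, and the "after" sample is simply drawn with equal prevalence to mimic a balanced population rather than produced by actual matching:

```r
# Standardized difference of means for one covariate
stdDiff <- function(xTarget, xComparator) {
  pooledSd <- sqrt((var(xTarget) + var(xComparator)) / 2)
  (mean(xTarget) - mean(xComparator)) / pooledSd
}

set.seed(218)
# Hypothetical binary covariate with made-up prevalences
beforeT <- rbinom(8000, 1, 0.30)
beforeC <- rbinom(12000, 1, 0.20)
afterT  <- rbinom(5000, 1, 0.25)
afterC  <- rbinom(5000, 1, 0.25)

sdm <- c(before = stdDiff(beforeT, beforeC), after = stdDiff(afterT, afterC))
round(sdm, 3)
```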
Reporting population characteristics
Most comparative cohort studies report select population characteristics before and after PS adjustment
DT::datatable(createCmTable1(balance))
Generalizability
PS adjustment \(\rightarrow\) make T and C more comparable
Consequence: modified population is less similar to starting data source
How different? And in what ways?
DT::datatable(getGeneralizabilityTable(balance))
Note
PS matching suggests an average treatment effect in the treated (ATT) analysis, so getGeneralizabilityTable() automatically selected the T cohort for evaluation.
Follow-up and power
Minimum detectable relative risk (MDRR): the smallest relative risk (under a simple Poisson model) that the study has at least 80% power to detect
computeMdrr(
  population = studyPop, # Should also compute under matchedPop
  modelType = "cox",
  alpha = 0.05,
  power = 0.8,
  twoSided = TRUE
)
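To build intuition for how the MDRR depends on the number of events, here is a base-R sketch of a common closed-form approximation (Schoenfeld-type, for a two-sided test). That this matches the internals of `computeMdrr()` is an assumption. The helper `mdrrApprox` is hypothetical, and the inputs reuse the counts from the simulated data summary (492 persons with the outcome, 4123 of 10000 in the target cohort) purely as an illustration:

```r
# Approximate MDRR from total events and the fraction of persons in the target cohort
mdrrApprox <- function(totalEvents, pTarget, alpha = 0.05, power = 0.8) {
  zAlpha <- qnorm(1 - alpha / 2) # two-sided critical value
  zBeta <- qnorm(power)
  exp(sqrt((zAlpha + zBeta)^2 / (totalEvents * pTarget * (1 - pTarget))))
}

mdrr <- mdrrApprox(totalEvents = 492, pTarget = 4123 / 10000)
mdrr
```

Fewer events (for example after matching discards unmatched persons) push the MDRR upward, which is why the comment above suggests also computing it under `matchedPop`.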
Using prior: None
Using 1 thread(s)
Outcome model fitting status is: OK
outcomeModel
Model type: cox
Stratified: FALSE
Use covariates: FALSE
Use inverse probability of treatment weighting: FALSE
Target estimand: att
Status: OK
          Estimate lower .95 upper .95   logRr seLogRr
treatment  1.18794   0.67710   2.09780 0.17222  0.2885
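The columns of this output are related in a simple way, which the base-R check below illustrates using the numbers printed above. The estimate is `exp(logRr)`, and the reported `seLogRr` is consistent with being derived from the width of the 95% interval on the log scale (an inference from the numbers, not a statement from the slides):

```r
# Values copied from the outcome model output above
logRr <- 0.17222
lower <- 0.67710
upper <- 2.09780

hr <- exp(logRr)                                            # hazard ratio estimate
seFromCi <- (log(upper) - log(lower)) / (2 * qnorm(0.975))  # SE implied by CI width

round(c(hr = hr, seLogRr = seFromCi), 4)
```

Note that the interval is not exactly symmetric around `logRr`, so `exp(logRr ± 1.96 * seLogRr)` only approximately reproduces the reported bounds.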
Using prior: None
Using 1 thread(s)
Outcome model fitting status is: OK
outcomeModel
Model type: cox
Stratified: FALSE
Use covariates: FALSE
Use inverse probability of treatment weighting: FALSE
Target estimand: att
Status: OK
                            Estimate lower .95 upper .95    logRr seLogRr
treatment                   1.125582  0.373762  3.497713 0.118300  0.5705
treatment * gender = FEMALE 1.090611  0.296899  3.920963 0.086738  0.6584
Tip
Include gender main-effect as well via includeCovariateIds = c(interactionCovariateIds)
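To read the interaction output, the two coefficients combine additively on the log scale. The base-R sketch below assumes the usual coding, in which the `treatment` row is the hazard ratio in the reference group (not female) and the interaction row is the multiplicative modification for females:

```r
# Values copied from the interaction model output above
logRrTreatment <- 0.118300
logRrInteraction <- 0.086738

hrReference <- exp(logRrTreatment)                   # treatment HR, reference group
hrFemale <- exp(logRrTreatment + logRrInteraction)   # treatment HR among females

round(c(reference = hrReference, female = hrFemale), 4)
```

A confidence interval for the female-specific hazard ratio cannot be obtained from the two printed intervals alone, because it also depends on the covariance between the two coefficient estimates.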
Multiple TCOs
CohortMethod has been finely tuned to execute efficiently across multiple:
Targets (T)
Comparators (C)
Outcomes (O) – think: negative control outcomes (NCOs)
Analyses (A) – think: TARs, matching vs stratification
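The number of effect estimates grows multiplicatively across these four dimensions. The base-R sketch below only enumerates the combinations and does not use CohortMethod's own multi-analysis functions. Cohort IDs 1, 2 and 77 come from the slides; the negative control outcome IDs 9001 to 9050 and the analysis labels are hypothetical:

```r
targetIds <- 1                 # celecoxib
comparatorIds <- 2             # diclofenac
outcomeIds <- c(77, 9001:9050) # outcome of interest + 50 hypothetical NCOs
analyses <- c("PS matching", "PS stratification")

# One row per target-comparator-outcome-analysis combination
tcoaGrid <- expand.grid(
  targetId = targetIds,
  comparatorId = comparatorIds,
  outcomeId = outcomeIds,
  analysis = analyses,
  stringsAsFactors = FALSE
)

nrow(tcoaGrid)
head(tcoaGrid, 3)
```

Even this small design requires 1 x 1 x 51 x 2 = 102 estimates, which is why sharing expensive steps (data extraction, propensity model fitting) across outcomes matters.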
Considerable work has been dedicated to providing the CohortMethod and Cyclops packages
citation("CohortMethod")
To cite package 'CohortMethod' in publications use:
Schuemie M, Suchard M, Ryan P (2024). _CohortMethod: New-User Cohort
Method with Large Scale Propensity and Outcome Models_. R package
version 5.4.0, commit 893b5445e8a92c3a118db1b9cf92db8dbccdee39,
<https://github.com/OHDSI/CohortMethod>.
A BibTeX entry for LaTeX users is
@Manual{,
title = {CohortMethod: New-User Cohort Method with Large Scale Propensity and Outcome
Models},
author = {Martijn Schuemie and Marc Suchard and Patrick Ryan},
year = {2024},
note = {R package version 5.4.0, commit 893b5445e8a92c3a118db1b9cf92db8dbccdee39},
url = {https://github.com/OHDSI/CohortMethod},
}
citation("Cyclops")
To cite Cyclops in publications use:
Suchard MA, Simpson SE, Zorych I, Ryan P, Madigan D (2013). "Massive
parallelization of serial inference algorithms for complex
generalized linear models." _ACM Transactions on Modeling and
Computer Simulation_, *23*, 10.
<https://dl.acm.org/doi/10.1145/2414416.2414791>.
A BibTeX entry for LaTeX users is
@Article{,
author = {M. A. Suchard and S. E. Simpson and I. Zorych and P. Ryan and D. Madigan},
title = {Massive parallelization of serial inference algorithms for complex generalized linear models},
journal = {ACM Transactions on Modeling and Computer Simulation},
volume = {23},
pages = {10},
year = {2013},
url = {https://dl.acm.org/doi/10.1145/2414416.2414791},
}
This work is supported in part through the National Institutes of Health and the U.S. Department of Veterans Affairs